weighted regression
Explainable AI in Spatial Analysis
A key objective in spatial analysis is to model spatial relationships and infer spatial processes to generate knowledge from spatial data, a task that has largely relied on spatial statistical methods. More recently, machine learning has offered scalable and flexible approaches that complement traditional methods, and it has been increasingly applied in spatial data science. Despite its advantages, machine learning is often criticized for being a black box, which limits our understanding of model behavior and output. Recognizing this limitation, explainable AI (XAI) has emerged as a pivotal field in AI that provides methods to explain the output of machine learning models to enhance transparency and understanding. These methods are crucial for model diagnosis, bias detection, and ensuring the reliability of results obtained from machine learning models. This chapter introduces key concepts and methods in XAI with a focus on Shapley value-based approaches, which are arguably the most popular XAI methods, and their integration with spatial analysis. An empirical example of county-level voting behaviors in the 2020 Presidential election is presented to demonstrate the use of Shapley values and spatial analysis, with a comparison to multiscale geographically weighted regression. The chapter concludes with a discussion of the challenges and limitations of current XAI techniques and proposes new directions.
- Summary/Review (0.66)
- Research Report (0.64)
- Transportation (0.88)
- Government > Regional Government > North America Government > United States Government (0.68)
- Government > Voting & Elections (0.49)
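The Shapley value idea named in the abstract above can be illustrated with a minimal sketch. It computes exact Shapley values by enumerating all feature coalitions for a toy model. The model, the data point, and the zero baseline used to "switch off" features are hypothetical assumptions for illustration, not the chapter's setup; practical tools typically approximate these values over background data rather than a single baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function over n_features players."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model with one interaction term (assumption).
def model(z):
    return 2.0 * z[0] + 1.0 * z[1] + z[0] * z[2]


x = [1.0, 2.0, 3.0]          # hypothetical instance to explain
baseline = [0.0, 0.0, 0.0]   # hypothetical reference values


def value_fn(coalition):
    # Features in the coalition keep their value; the rest use the baseline.
    z = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return model(z)


phi = shapley_values(value_fn, len(x))
print(phi)
```

In this sketch the attributions sum to the difference between the model output at `x` and at `baseline`, and the interaction term is split equally between the two features involved.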
Cybercrime Prediction via Geographically Weighted Learning
Khan, Muhammad Al-Zafar, Al-Karaki, Jamal, Mahafzah, Emad
Inspired by the success of Geographically Weighted Regression and its accounting for spatial variations, we propose GeogGNN, a graph neural network model that accounts for latitude and longitude coordinates. Using a synthetically generated dataset, we apply the algorithm to a 4-class classification problem in cybersecurity with seemingly realistic geographic coordinates centered in the Gulf Cooperation Council region. We demonstrate that it has higher accuracy than standard neural networks and convolutional neural networks that treat the coordinates as features. Encouraged by the improvement in model accuracy from the GeogGNN model, we provide a general mathematical result that demonstrates that a geometrically weighted neural network will, in principle, always display higher accuracy in the classification of spatially dependent data by making use of spatial continuity and local averaging features.
- Asia > China (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States (0.04)
- (7 more...)
Machine Learning for Dynamic Management Zone in Smart Farming
Kulatunga, Chamil, Dhelim, Sahraoui, Kechadi, Tahar
Agriculture 4.0 is using many modern research and technologies in different aspects of agriculture, including genomics, nanotechnology, synthetic proteins, Internet of Things, automation and machine learning [1]. As an important pillar in this space, data-driven agriculture has gained momentum in the last twenty years as a retrofitting mechanism for the available technologies to feed a population of 9 billion in 2050. It has become more realistic than ever due to the wider use of sensors, cloud computing and their integration with cyber-physical-social farming systems to use big data for intuition, intelligence and insights. However, data-driven agriculture is challenging for small actors but important for global sustainability compared to other industries such as healthcare, fin-tech and manufacturing.
Due to economic and logistic reasons, soil sampling is not frequent enough to understand its impact on annual yield. For example, P, K and Mg are tested once every three years. However, altitude and soil texture data do not change or change slowly. Based on our data management experience in UK farms, yield maps have been collected by many farmers in the last two decades. Most of the analyses have focused on the spatial variability of individual maps. Due to the lack of consecutive yield maps and crop rotation complexities, spatio-temporal analysis has been limited so far [?]. Therefore, many farmers, agronomists and scientists are interested in looking at the relations of those data layers, deriving compound new data layers and accordingly make site-specific
- Europe > United Kingdom > England (0.14)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
- North America > United States > Michigan (0.04)
- (2 more...)
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning
Zhang, Tianle, Guan, Jiayi, Zhao, Lin, Li, Yihang, Li, Dongjiang, Zeng, Zecui, Sun, Lei, Chen, Yue, Wei, Xuelong, Li, Lusong, He, Xiaodong
Offline reinforcement learning (RL) aims to learn optimal policies from previously collected datasets. Recently, due to their powerful representational capabilities, diffusion models have shown significant potential as policy models for offline RL problems. However, previous offline RL algorithms based on diffusion policies generally adopt weighted regression to improve the policy. This approach optimizes the policy using only the collected actions and is sensitive to Q-values, which limits the potential for further performance enhancement. To this end, we propose a novel preferred-action-optimized diffusion policy for offline RL. In particular, an expressive conditional diffusion model is utilized to represent the diverse distribution of a behavior policy. Meanwhile, based on the diffusion model, preferred actions within the same behavior distribution are automatically generated through the critic function. Moreover, an anti-noise preference optimization is designed to achieve policy improvement by using the preferred actions, which can adapt to noise-preferred actions for stable training. Extensive experiments demonstrate that the proposed method provides competitive or superior performance compared to previous state-of-the-art offline RL methods, particularly in sparse reward tasks such as Kitchen and AntMaze. Additionally, we empirically demonstrate the effectiveness of anti-noise preference optimization.
An Ensemble Framework for Explainable Geospatial Machine Learning Models
The relationships between things can vary significantly across different spatial or geographical contexts, a phenomenon that manifests in various spatial events such as the disparate impacts of pandemics [1], the dynamics of poverty distribution [2], and fluctuations in housing prices [3]. By optimizing spatial analysis methods, we can enhance the accuracy of predictions, improve the interpretability of models, and make more effective spatial decisions or interventions [4]. Nonetheless, the inherent complexity of spatial data and the potential for nonlinear relationships pose challenges to enhancing interpretability through traditional spatial analysis techniques [5]. Among models for analyzing spatially varying effects, such as spatial filtering models [6-8] and spatial Bayesian models [9], Geographically Weighted Regression (GWR) and Multiscale Geographically Weighted Regression (MGWR) stand out for their application of local spatial weighting schemes, which are instrumental in capturing spatial features more accurately [10, 11]. These linear regression-based approaches, however, encounter significant hurdles in decoding complex spatial phenomena (Figure 1). Various Geographically Weighted (GW) models have been developed to tackle issues such as multicollinearity [12, 13] and to extend the utility of GW models to classification tasks [14-17]. The evolution of artificial intelligence (AI) methodologies, including Artificial Neural Networks (ANN) [18], Graph Neural Networks (GNN) [19, 20], and Convolutional Neural Networks (CNN) [21], has introduced novel ways to mitigate uncertainties around spatial proximity and weighting kernels in GW models. Despite these advancements in marrying geospatial models with AI, challenges remain in addressing nonlinear correlations and deciphering underlying spatial mechanisms.
- Asia > China > Hubei Province > Wuhan (0.05)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Germany > Berlin (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
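The local spatial weighting scheme that the abstract above attributes to GWR can be sketched as a weighted least-squares fit repeated at each target location. The example below is a minimal illustration under stated assumptions: a Gaussian kernel, a fixed bandwidth, a single predictor, and synthetic data whose true slope grows from west to east. None of these choices come from the paper.

```python
import math


def gaussian_kernel(distance, bandwidth):
    """Assumed kernel: weight decays smoothly with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_fit(target, coords, x, y, bandwidth):
    """Weighted least squares for y = a + b * x around one target location."""
    w = [gaussian_kernel(math.dist(target, c), bandwidth) for c in coords]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return my - slope * mx, slope


# Synthetic data (assumption): true slope is 1 + u at east-west position u.
coords, x, y = [], [], []
for u in range(5):
    for xv in (1.0, 2.0, 3.0):
        coords.append((float(u), 0.0))
        x.append(xv)
        y.append((1.0 + u) * xv)

_, west_slope = local_fit((0.0, 0.0), coords, x, y, bandwidth=1.0)
_, east_slope = local_fit((4.0, 0.0), coords, x, y, bandwidth=1.0)
print(west_slope, east_slope)
```

Each local fit is still linear, which is the limitation the abstract points to: the coefficients vary over space, but the relationship at any one location is a straight line.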
Covariate-distance Weighted Regression (CWR): A Case Study for Estimation of House Prices
Chu, Hone-Jay, Chen, Po-Hung, Chang, Sheng-Mao, Ali, Muhammad Zeeshan, Patra, Sumriti Ranjan
Geographically weighted regression (GWR) is a popular tool for modeling spatial heterogeneity in a regression model. However, the current weighting function used in GWR only considers the geographical distance, while the attribute similarity is totally ignored. In this study, we propose a covariate weighting function that combines the geographical distance and attribute distance. The covariate-distance weighted regression (CWR) is an extension of GWR that includes both geographical distance and attribute distance. House prices are affected by numerous factors, such as house age, floor area, and land use. A prediction model is used to help understand the characteristics of regional house prices. The CWR was used to understand the relationship between the house price and controlling factors. The CWR can consider the geographical and attribute distances, and produce accurate estimates of house price that preserve the weight matrix for geographical and attribute distance functions. Results show that the house attributes/conditions and the characteristics of the house, such as floor area and house age, might affect the house price. After factor selection, in which only house age and floor area of a building are considered, the RMSE of the CWR model can be improved by 2.9%-26.3% for skyscrapers when compared to the GWR. CWR can effectively reduce estimation errors from traditional spatial regression models and provide novel and feasible models for spatial estimation.
- Asia > Taiwan > Taiwan Province > Taipei (0.04)
- Asia > Singapore (0.04)
- Asia > China (0.04)
- (2 more...)
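The CWR abstract above describes a weighting function that combines geographical distance with attribute distance but does not give its form. The sketch below assumes one plausible form, a product of two Gaussian kernels, purely for illustration. The bandwidths, attribute scales, and house records are hypothetical and are not taken from the paper.

```python
import math


def gaussian_kernel(distance, bandwidth):
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def combined_weight(geo_dist, attr_dist, geo_bw, attr_bw):
    """Assumed combination: product of a geographic and an attribute kernel."""
    return gaussian_kernel(geo_dist, geo_bw) * gaussian_kernel(attr_dist, attr_bw)


def attribute_distance(a, b, scales):
    """Euclidean distance on attributes rescaled by hypothetical scales."""
    return math.sqrt(sum(((ai - bi) / s) ** 2 for ai, bi, s in zip(a, b, scales)))


# Hypothetical houses: (location, (age in years, floor area in m^2)).
target = ((0.0, 0.0), (10.0, 100.0))
similar = ((1.0, 0.0), (10.0, 100.0))     # same distance, same attributes
different = ((1.0, 0.0), (50.0, 300.0))   # same distance, very different house
scales = (20.0, 100.0)                    # assumed attribute scales

weights = {}
for name, (loc, attrs) in (("similar", similar), ("different", different)):
    d_geo = math.dist(target[0], loc)
    d_attr = attribute_distance(target[1], attrs, scales)
    weights[name] = {
        "geo_only": gaussian_kernel(d_geo, 1.0),
        "combined": combined_weight(d_geo, d_attr, 1.0, 1.0),
    }
print(weights)
```

Under a purely geographic kernel both neighbours receive the same weight, while the combined weight discounts the dissimilar house. These weights would then feed a local weighted regression in the usual GWR manner.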
Active Learning with Statistical Models
An active learning problem is one where the learner has the ability or need to influence or select its own training data. Many problems of great practical interest allow active learning, and many even require it. We consider the problem of actively learning a mapping $X \to Y$ based on a set of training examples $\{(x_i, y_i)\}_{i=1}^{m}$, where $x_i \in X$ and $y_i \in Y$. The learner is allowed to iteratively select new inputs x (possibly from a constrained set), observe the resulting output y, and incorporate the new examples (x, y) into its training set. The primary question of active learning is how to choose which x to try next. There are many heuristics for choosing x based on intuition, including choosing places where we don't have data, where we perform poorly [Linden and Weber, 1993], where we have low confidence [Thrun and Moller, 1992], where we expect it
Spatially-Aware Car-Sharing Demand Prediction
Mühlematter, Dominik J., Wiedemann, Nina, Xin, Yanan, Raubal, Martin
In recent years, car-sharing services have emerged as viable alternatives to private individual mobility, promising more sustainable and resource-efficient, but still comfortable transportation. Research on short-term prediction and optimization methods has improved operations and fleet control of car-sharing services; however, long-term projections and spatial analysis are sparse in the literature. We propose to analyze the average monthly demand in a station-based car-sharing service with spatially-aware learning algorithms that offer high predictive performance as well as interpretability. In particular, we compare the spatially-implicit Random Forest model with spatially-aware methods for predicting average monthly per-station demand. The study utilizes a rich set of socio-demographic, location-based (e.g., POIs), and car-sharing-specific features as input, extracted from a large proprietary car-sharing dataset and publicly available datasets. We show that the global Random Forest model with geo-coordinates as an input feature achieves the highest predictive performance with an R-squared score of 0.87, while local methods such as Geographically Weighted Regression perform almost on par and additionally yield exciting insights into the heterogeneous spatial distributions of factors influencing car-sharing behaviour. Additionally, our study offers effective as well as highly interpretable methods for diagnosing and planning the placement of car-sharing stations.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
- (4 more...)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
GWRBoost: A geographically weighted gradient boosting method for explainable quantification of spatially-varying relationships
Wang, Han, Huang, Zhou, Yin, Ganmin, Bao, Yi, Zhou, Xiao, Gao, Yong
Geographically weighted regression (GWR) is an essential tool for estimating the spatial variation of relationships between dependent and independent variables in geographical contexts. However, the classical linear regressions that compose the GWR model are prone to underfitting, especially on large-volume, complex nonlinear data, which results in inferior comparative performance. Some advanced models, such as the decision tree and the support vector machine, can learn features from complex data more effectively, but they cannot provide explainable quantification of the spatial variation of localized relationships. To address these issues, we propose a geographically weighted gradient boosting regression model, GWRBoost, that applies a localized additive model and a gradient boosting optimization method to alleviate underfitting while retaining explainable quantification capability for spatially-varying relationships between geographically located variables. Furthermore, we formulate the computation method of the Akaike information score for the proposed model to conduct a comparative analysis with the classic GWR algorithm. Simulation experiments and an empirical case study are used to demonstrate the efficient performance and practical value of GWRBoost. The results show that our proposed model can reduce the RMSE by 18.3% in parameter estimation accuracy and the AICc by 67.3% in goodness of fit.
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)